{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# CS 237 Spring 2021, HW 02 \n", "\n", "#### Due date: Thursday February 11th at Midnight (1 minute after 11:59pm on 2/11) via Gradescope (6 hour grace period)\n", "\n", " Late policy: You may submit the homework up to 24 hours late for a 10% penalty. Hence, the late deadline is Friday 2/2 at Midnight (with the same 6 hour grace period). \n", "\n", "#### General Instructions\n", "\n", "Please complete this notebook by filling in solutions where indicated. Be sure to \"Restart and Run All\" from the Kernel menu before submitting to Gradescope. \n", "\n", "There are 8 analytical problems and 4 programming problems. An introductory video will be posted on YT Friday afternoon for\n", "the analytical problems, and the programming problems will be covered Friday in lab. \n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Here are some imports which will be used in code that we write for CS 237\n", "\n", "# IMPORTANT: DO NOT MAKE ANY OTHER IMPORTS WITHOUT DISCUSSING WITH PROFESSOR SNYDER\n", "\n", "# Imports used for the code in CS 237\n", "\n", "import numpy as np # arrays and functions which operate on arrays, plus math functions\n", "import matplotlib.pyplot as plt # normal plotting \n", "import math # You may use the math library if you really want, but\n", " # I recommend you use the numpy library for all mathematical operations.\n", " # Examples of use are in the notebook NumpyTutorial.ipynb on the class web site. \n", "\n", "\n", "from collections import Counter # Essential for creating distributions from experiments\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Analytical Problems\n", "\n", "The first 4 problems are from the textbook End of Chapter Problems for Chapter 1 here. If only certain parts of a problem are specified, then (naturally) just do those parts; if no such restriction is specified, you must do all parts of the problem. \n", "\n", "\n", "For Problems 5, 6, and 8, **Analyze** means to specify \n", "
my_random_integer()
to floating-point numbers in the range $[0..1).$\n",
"\n",
"\n",
" Again, we are writing our own version of the Numpy function np.random.random()
which you used in the first homework. You may NOT use the Numpy Random library; we are learning how it was written! \n",
" That is why we added \"my_\" to the names of the functions in this homework. After this homework, we will go back\n",
" to use `numpy` functions, but then you will know how they are written!\n",
"
\n",
"\n",
"\n",
"Hint: Easily done by division! You won't get all possible floating-point numbers, just\n",
"$m$ floats evenly spaced throughout the interval $[0..1)$. "
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"a = 1103515245\n",
"c = 12345\n",
"m = 2**31 \n",
"\n",
"def my_random():\n",
" pass # Your code here \n",
"\n",
"# Run this cell several times and verify that it works as you expect; try it with and\n",
"# without the seed\n",
"\n",
"my_seed(0)\n",
"\n",
"my_random() # with seed 0, should get 5.748588591814041e-06"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"# Now test it by printing out 20 random floats in the range [0..1) using a seed of 5.\n",
"# Try it with and without the seed, but make sure the seed is uncommented before you\n",
"# Run All and submit\n",
"\n",
"# with seed 5, first number should be 0.5693273963406682\n",
"# with seed 5, last number should be 0.7451900173909962\n",
"\n",
"# Your code here"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### (C) Pseudo-random Integers in range [lo..hi)\n",
"\n",
"Now we will use the `my_random()` function created in (B) to create random integers in a range $[lo..hi)$\n",
"as if imitating np.random.randint(lo,hi)
. Note that lo
is an optional argument,\n",
"so we want the following behavior, following np.random.randint(...)
:\n",
"\n", "This is a bit tricky, because the first argument is optional, where the second is not.\n", "The best way to do this is to collect the arguments in a list (see the Optional Arguments section right at the beginning of PythonFor237.ipynb for\n", "how to do this) and figure out how many and what\n", "they mean once you call the function. \n", "\n", "Having gotten the optional arguments to work, next we need to figure out how to\n", "create the appropriate values from\n", " np.random.randint(10) # returns a random value from [0..10) \n", " np.random.randint(1,7) # returns a random value from [1..7) \n", "
my_random()
. \n",
"\n",
"\n",
"To do this, you will need to take the value returned by my_random()
and (in this order):\n",
"\n",
" - scale $[0..1)$ to $[0..(hi - lo))$ by multiplication\n",
" - shift $[0..(hi - lo))$ to $[lo .. hi)$ by addition\n",
" - convert to integers using int(...)
. \n",
" \n",
"Note that the first two steps are a 1D version of what you did in 2D when you created the circle in Problem 5 in HW 01. \n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[None, None, None, None, None, None, None, None, None, None]\n",
"\n",
"[None, None, None, None, None, None, None, None, None, None]\n",
"\n",
"[None, None, None, None, None, None, None, None, None, None]\n",
"\n",
"[None, None, None, None, None, None, None, None, None, None]\n"
]
}
],
"source": [
"# Solution\n",
"\n",
"# my_randint(hi) returns a random integer in the range [0..hi)\n",
"# my_randint(lo,hi) returns a random integer in the range [lo..hi)\n",
"\n",
"def my_randint( *args ): \n",
" pass # Your code here \n",
" \n",
" \n",
"# Test itd using a seed of 0\n",
"\n",
"my_seed(0)\n",
"\n",
" \n",
"print( [ my_randint(10) for k in range(10) ] ) # should get [0, 6, 3, 6, 1, 5, 4, 6, 3, 2]\n",
" \n",
"print() \n",
"\n",
"print( [ my_randint(2) for k in range(10) ] ) # should get [0, 1, 0, 0, 1, 1, 1, 1, 0, 1]\n",
" \n",
"print() \n",
"\n",
"print( [ my_randint(1,7) for k in range(10) ] ) # should get [4, 2, 5, 5, 6, 2, 3, 5, 5, 2] \n",
" \n",
"print() \n",
"\n",
"print( [ my_randint(0,10) for k in range(10) ] ) # should get [0, 7, 0, 5, 2, 5, 5, 3, 1, 6] \n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Problem 10: Testing for Randomness\n",
"\n",
"In this problem we will try three different tests for whether our functions are behaving\n",
"randomly. These are rather informal, since all we will do is manipulate the random\n",
"sequence in various ways and verify by inspection that it \"looks random.\"\n",
"\n",
"Essentially, we will be repeating Problems 1 (B) and 3 from HW 01, but using our own versions of the random number generators. \n",
"\n",
"For further information on tests of randomness, see this Wikipedia page. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Part (A) The Pair Test.\n",
"\n",
" Now we will test our function my_random()
developed above. For the first test, you must copy your solution to the Problem 1 (B) in HW 01, \n",
" and run it with your functions my_seed(0)
and my_random()
, instead of the numpy functions we used there. You do NOT need to provide an estimate of $\\pi$; all we are interested in is your visual\n",
" inspection of the distribution of the points in the unit square. \n",
" \n",
" I will provide a solution to HW 01 on Saturday, after the late deadline has passed, in case you want to use my solution. \n",
"\n",
" \n",
"Use a seed of 0 and let `num_trials = 5000`. \n",
" \n",
"Confirm by eye that the points appear to be random. \n",
" \n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"a = 1103515245\n",
"c = 12345\n",
"m = 2**31 \n",
"\n",
"# Your code here "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### (B) The Frequency Test\n",
"\n",
"For this part, we would like you to do a frequency test. This is essentially the same as Problem 3 of HW 01 where we determined whether the probability function was equinumerous (that is, all the bins were about the same height). Essentially, what this amounts to is displaying the frequency distribution and examining its shape to verify that it conforms to what you would expect (of course, it will be an approximation of the ideal shape, which\n",
"would get more precise as you do more trials). \n",
"\n",
"There is a subtle problem with this test and its relationship with the period of the random number generator: if you generate exactly as many numbers as in the period of `my_random_integer()`, then by definition the bins will be exactly the same size, since the sequence contains all possible numbers in the range $[0 .. 2^{31})$. (You could verify this by doing $m=2^{31}$ trials if you wish!)\n",
"\n",
"Another way to think of this is that the sequence is \"more random\" near the beginning than near the end, since near the end\n",
"it can only choose numbers in the range $[0..2^{31})$ that have not yet been generated. The last number generated in the period is the seed, and hence not random at all!\n",
"\n",
"Hence we do not want to generate too much of the entire period to do this test. We will therefore use only $10^6$ numbers, which represent\n",
"\n",
"$$\\frac{10^6}{2^{31}} \\, =\\, 4.6566128730773926\\text{e-04}$$\n",
" \n",
"or about 0.046 % of the period. This should be sufficiently random to do the test!\n",
"\n",
"Todo:\n",
"\n",
"Generate $10^6$ values using my_randint(100)
with a seed of 0. \n",
" \n",
"Use `show_distribution(...)` from Problem 3 of HW 01 and simply observe that\n",
"the bins are approximately the same height, as in HW 01. \n",
"\n",
"(I will post the solution for this on Saturday, after the late deadline passes). \n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"# Solution\n",
"\n",
"# Your code here "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Problem 11: Further Tests of Randomness\n",
"\n",
"One of the characteristics of random sequences is that if you perform some simple\n",
"transformation of the sequence, it should still be random. Various statistical tests of\n",
"randomness are based on this idea. In this problem, we will transform the random sequences\n",
"we generate using `my_randint(...)` to test for randomness in two ways (I have taken these from here) and test the derived sequences for randomness using the frequency test from the previous problem (which amounts to displaying the distribution of outcomes and examining its shape). \n",
"\n",
"*The serial test: Generate two random digits using `my_randint(10)` and form a two-digit number from them (e.g., if you generated 8 and 3, then produce 83). Using this method, generate a long sequence of two-digit numbers\n",
"and verify it is approximately equiprobable using the frequency test (as in Problem 10 (B)). \n",
"\n",
"*The gap test: Generate a sequence of 0's and 1's using `my_randint(2)` and find the distances between occurrences of 0's (00 would be a distance of 0, 010 would be a distance of 1, 01110 would be a distance of 3, etc.). \n",
"Using this idea, generate a long sequence of such distances, and verify using the frequency test (it will not be equiprobably, see next paragraph). \n",
"\n",
"How to proceed:\n",
" \n",
"The serial test is essentially identical to Problem 10 (B) and you should get a very similar result. Just write a function `generate_two_digit_integer()` and use it in place of `my_randint(100)`\n",
"\n",
"The gap test is essentially a trivial modification of the \"Flip a coin until you get a head\" experiment: 0 = heads, 1 = tails,\n",
"and you are counting the number of tails instead of the total number of heads and tails. Write a function `generate_until_0()` which count the number of 1's generated until you get the first 0. \n",
"You should get the following distribution:\n",
"\n",
" S = { 0, 1, 2, 3, ... }\n",
" P = { 1/2, 1/4, 1/8, 1/16, ... }"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"# (A) The Serial Test Use a seed of 5 and num_trials = 10**6\n",
"\n",
"# Your code here"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"# (B) The Gap Test Use a seed of 7 and num_trials = 10**6\n",
"\n",
"# Your code here\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Problem 12: \n",
"\n",
"Verify the result you calculated for Analytical Problem 8 above (the \"flip until heads\" experiment returns an even number)\n",
"by running $10^6$ trials. The answer should be close to the result obtained by your mathematical analysis. \n",
"\n",
"You must \n",
"\n",
"(A) Print out the percentage of the total outcomes which were even (as a float, that is, 33 % would be 0.33), and \n",
"\n",
"(B) Display the distribution of even and odd outcomes (two bins, very simple!). To produce a nice, informative display, you must label the X-axis with \"even\" and \"odd\" (by resetting\n",
"the plt.xticks(...), see the `PythonFor237` notebook in the section on Bar Charts for an example). You will have to\n",
"modify the `show_distribution` function to do this, or simply copy the body of the function into the code cell for your solution. \n",
"\n",
"\n",
"\n",
"Hints: For (B) you will need to convert your\n",
"experimental outcomes to show whether the result was even or odd. The easiest\n",
"way is to convert all even outcomes to 0 and all odd outcomes to 1, using\n",
"a simple list comprehension. \n",
"\n",
"To find the percentage of 0's in the list of outcomes, use `Counter(...)`. \n"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"# Solution (A) Use a seed of 2 and num_trials = 10**6\n",
"\n",
"\n",
"# Your code here\n",
"\n",
"\n",
" "
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"# Solution for (B) Use outcomes from (A) to draw the distribution\n",
"\n",
"# Your code here"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}